Generalized reinforcement learning in perfect-information games

Authors

  • Maxwell Pak
  • Bing Xu
Abstract

This paper studies action-based reinforcement learning in finite perfect-information games. Restrictions on the valuation updating rule that are necessary and sufficient for play to converge to a subgame-perfect Nash equilibrium (SPNE) are identified. These conditions encompass well-known examples of reinforcement learning and are mild enough to contain other interesting and plausible learning behavior. We provide examples of such updating rules that suggest that the extent of knowledge and rationality assumptions needed to support an SPNE outcome in finite perfect-information games may be minimal.

Similar resources

Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs

Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata-based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...

Reinforcement Learning in Perfect-Information Games

This paper studies action-based reinforcement learning in finite perfect-information games. Restrictions on the valuation updating rule that guarantee that play eventually converges to a subgame-perfect Nash equilibrium (SPNE) are identified. These conditions are mild enough to contain interesting and plausible learning behavior. We provide two examples of such updating rules that suggest ...

A reinforcement learning process in extensive form games

The CPR (“cumulative proportional reinforcement”) learning rule stipulates that an agent chooses a move with a probability proportional to the cumulative payoff she obtained in the past with that move. Previously considered for strategies in normal form games (Laslier, Topol and Walliser, Games and Econ. Behav., 2001), the CPR rule is here adapted for actions in perfect information extensive fo...
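
The choice rule described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`cpr_choose`, `cpr_update`, `simulate`), the single-agent setting with a fixed payoff per move, and the strictly positive initial weights are all assumptions made for illustration only.

```python
import random

def cpr_choose(cumulative, rng):
    """Pick a move with probability proportional to its cumulative payoff.

    Assumes every cumulative payoff is strictly positive (hypothetical setup).
    """
    moves = list(cumulative)
    weights = [cumulative[m] for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]

def cpr_update(cumulative, move, payoff):
    """Add the payoff just obtained to the chosen move's running total."""
    cumulative[move] += payoff

def simulate(payoffs, initial, rounds, seed=0):
    """Repeat choose-then-reinforce for a single agent with fixed payoffs.

    `payoffs` maps each move to a fixed positive payoff; this stands in for
    the game and is an assumption, not the extensive-form model of the paper.
    """
    rng = random.Random(seed)
    cumulative = dict(initial)
    for _ in range(rounds):
        move = cpr_choose(cumulative, rng)
        cpr_update(cumulative, move, payoffs[move])
    return cumulative

if __name__ == "__main__":
    totals = simulate({"L": 1.0, "R": 3.0}, {"L": 1.0, "R": 1.0}, rounds=200)
    print(totals)
```

In this toy setting the move with the larger payoff is reinforced faster, so its share of the total weight tends to grow over time; the sketch makes no claim about convergence in actual extensive-form games.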

Hierarchical Reinforcement Learning with Deictic Representation in a Computer Game

Computer games are challenging test beds for machine learning research. Without applying abstraction and generalization techniques, many traditional machine learning techniques, such as reinforcement learning, will fail to learn efficiently. In this paper we examine extensions of reinforcement learning that scale to the complexity of computer games. In particular we look at hierarchical reinfor...

Policy Learning in Imperfect-information Infinite Dynamic Games

Dynamic games (DGs) play an important role in distributed decision making and control in complex environments. Finding optimal/approximate solutions for these games in the imperfect-information setting is currently a challenge for mathematicians and computer scientists, especially when state and action spaces are infinite. This paper presents an approach to this problem by using multi-agent rei...

Journal:
  • Int. J. Game Theory

Volume 45

Publication date: 2016